Attention mechanism

See Transformer model, BERT, Recurrent neural network

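Compared with the Sentence embedding approach, which compresses the whole input into a single fixed-length vector, an attention mechanism retains information from longer sentences. The context vector is generated dynamically at each output step as a weighted combination of the encoder states, which gives the decoder shortcuts to individual words in the input sentence.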
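A minimal sketch of how such a context vector could be computed, assuming simple dot-product scoring (other scoring functions exist, and this page does not say which one is meant). The names attention_context, query and encoder_states are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    # Subtract the maximum score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(query, encoder_states):
    """Return (context, weights) for one decoder step.

    query: the current decoder state, a list of floats of length d.
    encoder_states: one list of floats of length d per input word.
    Assumption: relevance is scored with a plain dot product.
    """
    # Score each input word against the query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights

# Toy input: three "words", each encoded as a 2-dimensional state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
context, weights = attention_context(query, encoder_states)
```

Because the weights are recomputed for every query, each output step can attend to different input words instead of relying on one fixed sentence vector.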

Variations

Library and code

Tutorials and articles

References